Introducing Multiple Pronunciations in Spanish Speech Recognition Systems
Abstract
Pronunciation variations are a common source of recognition errors in real-world applications, so specific techniques must be developed to handle them. We describe a method for incorporating pronunciation alternatives, which has been tested with both continuous and isolated-word speech recognisers for Spanish. We present an automatic grapheme-to-phoneme system, modified to generate alternative pronunciations. It applies manually developed phonological rules that capture certain variations which are well known in the linguistic community but not widely exploited in Spanish speech recognition. We apply this strategy only at the recognition stage of both a continuous speech recogniser for clean speech data and an isolated-word recogniser for a telephone task. We report a decrease in error rate of up to 20% for the continuous speech task, while for the isolated-word recognition task no significant effect was found. We conclude by analysing which effects have led to these results and discussing future work.
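The abstract does not list the phonological rules themselves, so the following is only a minimal sketch of how optional rewrite rules could expand one canonical transcription into several lexicon entries. The three rules (deletion of /d/ in "-ado", deletion of word-final /d/, aspiration of word-final /s/), the one-character-per-phone notation, and the names OPTIONAL_RULES and pronunciations are all illustrative assumptions, not the authors' rule set or implementation.

```python
import re

# Hypothetical optional rules, each a (pattern, replacement) pair applied to a
# transcription written with one character per phone. These are assumptions
# chosen for illustration; the paper's actual rules are not given here.
OPTIONAL_RULES = [
    (re.compile(r"ado$"), "ao"),  # /d/ deletion in the ending -ado
    (re.compile(r"d$"), ""),      # word-final /d/ deletion
    (re.compile(r"s$"), "h"),     # word-final /s/ aspiration
]


def pronunciations(canonical):
    """Return the canonical transcription plus every variant obtained by
    optionally applying each rule, as a sorted list without duplicates."""
    variants = {canonical}
    for pattern, replacement in OPTIONAL_RULES:
        # Because each rule is optional, keep the old variants and add the
        # rewritten ones; rules that do not match leave the set unchanged.
        variants |= {pattern.sub(replacement, v) for v in variants}
    return sorted(variants)


if __name__ == "__main__":
    for word in ("kansado", "madrid", "mesas", "kasa"):
        print(word, pronunciations(word))
```

In a sketch like this, every returned variant would become a separate lexicon entry for the same word, so only the recognition-stage dictionary changes, in line with the abstract's statement that the strategy is applied only at the recognition stage.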
Similar resources
Improving continuous speech recognition in Spanish by phone-class semicontinuous HMMs with pausing and multiple pronunciations
This paper presents a comprehensive study of continuous speech recognition in Spanish. It shows the use and optimisation of several well-known techniques together with the application for the first time to Spanish of language specific knowledge to these systems, i.e. the careful selection of the phone inventory, the phone-classes used, and the selection of alternative pronunciation rules. We have...
Wiktionary as a source for automatic pronunciation extraction
In this paper, we analyze whether dictionaries from the World Wide Web which contain phonetic notations, may support the rapid creation of pronunciation dictionaries within the speech recognition and speech synthesis system building process. As a representative dictionary, we selected Wiktionary [1] since it is at hand in multiple languages and, in addition to the definitions of the words, many...
Automatic Pronunciation Dictionary Generation from Wiktionary and Wikipedia
In this work we show that dictionaries from the World Wide Web which contain phonetic notations may represent a good basis for the rapid pronunciation dictionary creation within the speech recognition and speech synthesis system building process. As a representative dictionary, we selected wiktionary.org [1] since it is available in multiple languages, and in addition to the definitions of the ...
Improved lexicon formation through removal of co-articulation and acoustic recognition errors
It is becoming increasingly necessary that speech recognition systems contain an accurate lexicon, consisting of likely word pronunciations that actually occur within a given domain. Given the increasing size of speech databases, it would appear that data-driven approaches are best suited to derive such pronunciations. Presently, however, such an approach often introduces implausible pronu...
Rate-of-speech Modeling for Large Vocabulary Conversational Speech Recognition
Variations in rate of speech (ROS) produce changes in both spectral features and word pronunciations that affect automatic speech recognition (ASR) systems. To deal with these ROS effects, we propose to use parallel, rate-specific, acoustic models: one for fast speech, the other for slow speech. Rate switching is permitted at word boundaries, to allow modeling within-sentence speech rate variat...